docs: node scan rfc #922

Open
alegrey91 wants to merge 13 commits into kubewarden:main from alegrey91:docs/node-scanning-rfc

@alegrey91
Collaborator

Description

Node Scan feature RFC.
Fix #889

Test

To test this pull request, you can run the following commands:

Additional Information

Tradeoff

Potential improvement

Signed-off-by: Alessio Greggi <alessio.greggi@suse.com>
Signed-off-by: Alessio Greggi <alessio.greggi@suse.com>
Signed-off-by: Alessio Greggi <alessio.greggi@suse.com>
@alegrey91 alegrey91 requested a review from a team as a code owner March 10, 2026 10:17
Copilot AI review requested due to automatic review settings March 10, 2026 10:17
@github-project-automation github-project-automation Bot moved this to Pending Review in SBOMscanner Mar 10, 2026
Contributor

Copilot AI left a comment

Pull request overview

Adds an RFC describing the proposed “Node Scan” feature, outlining how node scanning should work and integrate with SBOMscanner, including intended CRDs and status conditions.

Changes:

  • Introduces a new RFC document for Node Scanning.
  • Describes the DaemonSet-based architecture and reuse of the worker via scan mode flagging.
  • Proposes new CRDs, a NodeMetadata structure, and NodeScanJob status conditions.

Comment thread docs/rfc/0008_node_scan.md Outdated
Comment thread docs/rfc/0008_node_scan.md
Comment thread docs/rfc/0008_node_scan.md Outdated
@codecov
codecov Bot commented Mar 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 52.22%. Comparing base (3a8bcd2) to head (f2fc011).
⚠️ Report is 267 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #922      +/-   ##
==========================================
+ Coverage   49.95%   52.22%   +2.26%     
==========================================
  Files          56       61       +5     
  Lines        4544     5147     +603     
==========================================
+ Hits         2270     2688     +418     
- Misses       1928     2071     +143     
- Partials      346      388      +42     
Flag Coverage Δ
unit-tests 52.22% <ø> (+2.26%) ⬆️


Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Flavio Castelli <flavio@castelli.me>
Member

@flavio flavio left a comment

This is a good start. There are details that are not covered, such as:

  • The daemonset will stay idle most of the time
  • How the scan is initiated
  • How the SBOM is generated. We both know this is going to be based on trivy scanning the filesystem of the host, mounted into the container. Explain that. Also, we will probably need to provide a way to exclude some host directories from the scan. AFAIK there's the risk of trivy scanning the directories where the container root filesystem is splatted, leading to assigning CVEs of the container images to the host itself.
  • Alternative approach: we could find a creative way to spin the daemonsets up and down on demand, but this is something that requires more work compared to the initial proposal. We expect the agent to be idle most of the time, consuming very few resources. We can revisit the approach later on, when we have real data.

Comment thread docs/rfc/0008_node_scan.md
Comment thread docs/rfc/0008_node_scan.md Outdated
Comment thread docs/rfc/0008_node_scan.md
Comment thread docs/rfc/0008_node_scan.md Outdated
Co-authored-by: Flavio Castelli <flavio@castelli.me>
Signed-off-by: Alessio Greggi <alessio.greggi@suse.com>
@alegrey91
Collaborator Author

  • Also, we will probably need to provide a way to exclude some host directories from the scan. AFAIK there's the risk of trivy scanning the directories where the container root filesystem is splatted, leading to assigning CVEs of the container images to the host itself.

You are right. The `skip` field in the NodeScanConfiguration does exactly this.

Signed-off-by: Alessio Greggi <alessio.greggi@suse.com>
Signed-off-by: Alessio Greggi <alessio.greggi@suse.com>
@flavio flavio removed this from SBOMscanner Mar 13, 2026
Comment thread docs/rfc/0008_node_scan.md Outdated
Comment thread docs/rfc/0008_node_scan.md
Signed-off-by: Alessio Greggi <alessio.greggi@suse.com>
@alegrey91 alegrey91 changed the title from "docs: node scanning rfc" to "docs: node scan rfc" Apr 3, 2026
@github-project-automation github-project-automation Bot moved this to Pending Review in SBOMscanner Apr 3, 2026
Signed-off-by: Alessio Greggi <alessio.greggi@suse.com>
Signed-off-by: Alessio Greggi <alessio.greggi@suse.com>
Signed-off-by: Alessio Greggi <alessio.greggi@suse.com>
Signed-off-by: Alessio Greggi <alessio.greggi@suse.com>

## Garbage Collection

Garbage collection is crucial to prevent resource orphaning and to maintain a clean cluster state.
Collaborator

There are two distinct cleanup mechanisms to consider: the GC and the configuration cleanup.

Kubernetes garbage collection (Node deletion)

The owner reference chain is: Node → NodeScanJob, and Node → NodeSBOM → NodeVulnerabilityReport. When a Node is deleted, Kubernetes garbage collection cascades through the owner references and cleans up all node-related resources automatically.

Reconciler cleanup (NodeScanConfiguration disabled/deleted)

When the NodeScanConfiguration is disabled or removed, the reconciler must actively clean up NodeScanJobs and NodeSBOMs. NodeVulnerabilityReports are cascade-deleted for free since they are owned by their respective NodeSBOM.
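To make the difference between the two mechanisms concrete, here is a minimal toy model in Go. It is not the Kubernetes garbage collector: it only walks an in-memory owner map that mirrors the chain described above. The object names and the `cascadeDelete` helper are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// owners maps each object to its owner, mirroring the owner reference chain
// from the comment above. The object names are hypothetical.
var owners = map[string]string{
	"NodeScanJob/worker-1":             "Node/worker-1",
	"NodeSBOM/worker-1":                "Node/worker-1",
	"NodeVulnerabilityReport/worker-1": "NodeSBOM/worker-1",
}

// cascadeDelete returns root plus every object that transitively depends on
// it, i.e. the set a cascading delete starting at root would remove.
func cascadeDelete(owners map[string]string, root string) []string {
	deleted := []string{root}
	queue := []string{root}
	for len(queue) > 0 {
		current := queue[0]
		queue = queue[1:]
		var children []string
		for child, owner := range owners {
			if owner == current {
				children = append(children, child)
			}
		}
		sort.Strings(children) // map iteration is random; keep output stable
		deleted = append(deleted, children...)
		queue = append(queue, children...)
	}
	return deleted
}

func main() {
	// Node deletion: every node-related resource is collected.
	fmt.Println(cascadeDelete(owners, "Node/worker-1"))
	// Reconciler cleanup: deleting the NodeSBOM takes its report with it,
	// but the NodeScanJob still has to be deleted explicitly.
	fmt.Println(cascadeDelete(owners, "NodeSBOM/worker-1"))
}
```

The second call shows why the reconciler has to handle NodeScanJobs and NodeSBOMs itself: neither is owned by the NodeScanConfiguration, so nothing cascades to them when it is disabled or removed.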

If not specified, all the nodes are scanned.
* `skip`: A list of file/directory paths to be ignored.

* `NodeScanJob`: Represents a single execution of a node scan.
Collaborator

Let's reuse the same retention mechanism used for ScanJobs: keep only the latest 10 NodeScanJobs per node.
See:

## ScanJob retention


* `Name` specifies the unique name of the node in the cluster.
* `Platform` specifies the OS + CPU architecture of the node. Example: linux/amd64, linux/arm64.

Collaborator

Please add a section describing which reconcilers need to be introduced, along with a couple of paragraphs explaining their role in the design.

  • NodeScanJobReconciler
  • NodeScanReconciler
  • NodeScanRunner (as a runnable)

The NodeScanRunner filters nodes using the nodeSelector and platform criteria. The platform filter can be implemented using controller-runtime indexes.
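As an illustration of the filtering step only, here is a minimal in-memory sketch. It does not use controller-runtime indexes; the `node` type, the `matches` helper, and the rule that an empty platform list matches every node are all assumptions.

```go
package main

import "fmt"

// node is a hypothetical, minimal view of a cluster node.
type node struct {
	Name     string
	Platform string // e.g. "linux/amd64"
	Labels   map[string]string
}

// matches reports whether n satisfies the node selector and platform filter.
// An empty selector matches every node; treating an empty platform list the
// same way is an assumption of this sketch.
func matches(n node, selector map[string]string, platforms []string) bool {
	for key, want := range selector {
		if got, ok := n.Labels[key]; !ok || got != want {
			return false
		}
	}
	if len(platforms) == 0 {
		return true
	}
	for _, p := range platforms {
		if p == n.Platform {
			return true
		}
	}
	return false
}

func main() {
	nodes := []node{
		{Name: "worker-1", Platform: "linux/amd64", Labels: map[string]string{"role": "worker"}},
		{Name: "worker-2", Platform: "linux/arm64", Labels: map[string]string{"role": "worker"}},
		{Name: "infra-1", Platform: "linux/amd64", Labels: map[string]string{"role": "infra"}},
	}
	selector := map[string]string{"role": "worker"}
	platforms := []string{"linux/amd64"}
	for _, n := range nodes {
		if matches(n, selector, platforms) {
			fmt.Println("scan", n.Name)
		}
	}
}
```

In the real runner the platform check would more likely be an indexed lookup rather than a loop over every node, as suggested above.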


## CRDs

For this feature we are going to add the following CRDs:
Collaborator

please add example CRDs

* `NodeScanConfiguration`: Defines the global scan settings.
* `scanInterval`: Duration between automated scans.
If not specified, the `NodeScanJob` doesn't start.
* `nodeSelector`: Filter which nodes are scanned.
Collaborator

Please add a platform(s) filter.

Comment thread docs/rfc/0008_node_scan.md
If not specified, the `NodeScanJob` doesn't start.
* `nodeSelector`: Filter which nodes are scanned.
If not specified, all the nodes are scanned.
* `skip`: A list of file/directory paths to be ignored.
Collaborator

We could use something like this for skip patterns:

  # Gitignore-style patterns to exclude from filesystem scans.
  # Trailing "/" = directory, otherwise = file.
  skipPatterns:
    - "node_modules/"       # → --skip-dirs node_modules
    - "**/vendor/"          # → --skip-dirs (glob expanded at scan time)
    - ".git/"               # → --skip-dirs .git
    - "*.min.js"            # → --skip-files *.min.js
    - "package-lock.json"   # → --skip-files package-lock.json
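The mapping in the comments above could be implemented roughly as follows. This is only a sketch: `trivySkipArgs` is a hypothetical helper, it relies on Trivy's `--skip-dirs` and `--skip-files` flags, and it passes glob patterns through unchanged instead of expanding them at scan time.

```go
package main

import (
	"fmt"
	"strings"
)

// trivySkipArgs converts gitignore-style skip patterns into Trivy CLI flags.
// A trailing "/" marks a directory (--skip-dirs); anything else is treated
// as a file (--skip-files). Globs are passed through as-is.
func trivySkipArgs(patterns []string) []string {
	var args []string
	for _, p := range patterns {
		p = strings.TrimSpace(p)
		if p == "" {
			continue // ignore empty entries
		}
		if strings.HasSuffix(p, "/") {
			args = append(args, "--skip-dirs", strings.TrimSuffix(p, "/"))
		} else {
			args = append(args, "--skip-files", p)
		}
	}
	return args
}

func main() {
	patterns := []string{"node_modules/", "**/vendor/", ".git/", "*.min.js", "package-lock.json"}
	fmt.Println(strings.Join(trivySkipArgs(patterns), " "))
}
```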


## Scan Workflow

1. The user applies a `NodeScanConfiguration` with a defined `scanInterval` or applies a `NodeScanJob` manually.
Collaborator

Let's revisit this and add the runner. That way we have two branches in the flow: one triggered by the user, and one triggered by the runner.

Signed-off-by: Alessio Greggi <alessio.greggi@suse.com>
@fabriziosestito fabriziosestito added the documentation Improvements or additions to documentation label Apr 16, 2026
Development

Successfully merging this pull request may close these issues.

RFC: Define how Node Scanning works

4 participants